Rhythm of the Night (and Day): Predictive Metabolic Modeling of Diurnal Growth in Chlamydomonas

ABSTRACT Economical production of photosynthetic organisms requires the use of natural day/night cycles. These induce strong circadian rhythms that lead to transient changes in the cells, requiring complex modeling to capture. In this study, we coupled time series transcriptomic data from the model green alga Chlamydomonas reinhardtii to a metabolic model of the same organism in order to develop the first transient metabolic model for diurnal growth of algae capable of predicting phenotype from genotype. We first transformed a set of discrete transcriptomic measurements (D. Strenkert, S. Schmollinger, S. D. Gallaher, P. A. Salomé, et al., Proc Natl Acad Sci U S A 116:2374–2383, 2019, https://doi.org/10.1073/pnas.1815238116) into continuous curves, producing a complete database of the cell’s transcriptome that can be interrogated at any time point. We also decoupled the standard biomass formation equation to allow different components of biomass to be synthesized at different times of the day. The resulting model was able to predict qualitative phenotypical outcomes of a starchless mutant. We also extended this approach to simulate all single-knockout mutants and identified potential targets for rational engineering efforts to increase productivity. This model enables us to evaluate the impact of genetic and environmental changes on the growth, biomass composition, and intracellular fluxes for diurnal growth.

IMPORTANCE We have developed the first transient metabolic model for diurnal growth of algae based on experimental data and capable of predicting phenotype from genotype. This model enables us to evaluate the impact of genetic and environmental changes on the growth, biomass composition, and intracellular fluxes of the model green alga, Chlamydomonas reinhardtii. The availability of this model will enable faster and more efficient design of cells for production of fuels, chemicals, and pharmaceuticals.

Review for article mSystems00176-22 "Rhythm of the Night (and Day): Predictive metabolic modeling of circadian growth in Chlamydomonas"

Summary:
This article presents an implementation of time-resolved transcriptomic data with flux balance analysis (FBA) to produce a dynamic version of expression-based stoichiometric metabolic modeling. Using discrete measurements of transcripts from another study, the authors developed continuous functions estimating transcript level (testing several different function estimations, assessing and determining the best fit for each gene). They implemented these transcript expression functions as additional constraints in the FBA framework to iteratively adjust the bounds for each metabolic reaction according to expression at each time step, also incorporating the capacity for changing biomass composition at each time step. The authors show that they can model the growth of wild type Chlamydomonas reinhardtii under diel conditions using the given data, and they also investigate the predicted effects of different gene knockouts, focusing mainly on predictions of the sta6 mutant which is unable to accumulate starch.
While the predictions using the gene expression-based dynamic FBA and the resulting flux map animations are certainly impressive, I think the impact and application of the article can be improved through the modifications suggested below, and the connection to the sustainable energy field can be strengthened.

Comments:
1. Alternate phrasing should be considered for the title in place of "circadian growth": either "circadian rhythm" or "diurnal growth", by convention.
2. The introduction focuses on the impact of light-dark cycles on economical growth of algae for sustainable fuels. However, the predictions show that WT C. reinhardtii is not expected to accumulate lipid droplets under nitrogen limitation in diurnal light as they are observed to do under constant light, which seems to undercut the rationale for pursuing the work presented in the introduction. The discussion and conclusion should more fully address this point and expand upon the potential applications.
3. It would be helpful for the introduction to include a bit more information on what kinds of algal production strains have been developed to date, to more clearly emphasize how the presented work can expand upon what has already been done.
4. The previous study from which the transcriptomic data was obtained is cited (Strenkert et al., 2019), but the experimental scenario should be explained further beyond the 12:12 light:dark criterion to better understand the conditions that the gene expression data represents (e.g., light intensity, CO2 availability, nitrogen source, etc.).
5. It is noted in line 115 that the sudden onset of light in the experimental conditions does not accurately mimic a natural environment; a more in-depth exposition of how this factor might affect the model results and the metabolic patterns interpreted from the model results would be helpful. Would simulation predictions be expected to be quite different?
6. The role of photoinhibition / photorespiratory pathways is not mentioned, yet these pathways are likely very important as high light intensities are commonly encountered during mid-day in natural environments. A discussion of the impact of photorespiration on productivity would be helpful.
7. A few details of the model implementation remain unclear: what time scale resolution is used in the dynamic FBA simulations? How is the changing objective function decided at each iteration (the initial objective is explained in lines 479-484, but the process for updating the biomass composition upon each iterative cycle is vague)? How realistic is it to use a changing biomass equation if it allows growth without requiring all biomass metabolites (line 437-439)?
8. The gene-reaction rules AND and OR described in lines 402-405 seem to be defined backwards. AND would imply that both genes are used, which would involve a sum of the fluxes; OR would imply one or the other reaction, which would involve a minimum of the two fluxes. Additional explanation of this logic would help clarify.
9. The knockout evaluation methods described in lines 495-497 state that a 168-hour simulation assumed a single doubling over that time frame, which seems extremely slow and conflicts with the information given in the Figure 3 legend which gives a doubling time of 24 hours; this apparent inconsistency should be clarified.
10. The arrangement of Figure 3C makes it confusing to understand the role of the two lines for sta6 and WT on the graph. It seems strange to show the WT when the x-axis represents knockouts, and they appear to have a final / initial mass value of 4, which is not possible for the WT according to the figure caption (reports a final / initial mass value of 2).
11. The modeling of mutant phenotypes presents some results that could be predicted as possible which are not actually feasible and would require experimental testing, due to the variable biomass composition and the way the biomass components are defined. A few examples of detecting false positives are listed, but what percent of false positives might be expected with this method? It represents a potentially quite cumbersome process to evaluate and check results, so addressing the impact of false positives would be important for assessing how useful mutant simulations are. Could it be a potential solution to employ a standard formulation FBA model and simply run two separate simulations (one with light and one without light) to test the feasibility and avoid some of these false positives?
12. The conclusion / discussion should better address how the presented method can be extended or applied to other organisms. Currently the presented work serves as a very nice case study with a specific data set, but translation to other organisms and clarifying the utility for bioengineering applications as stated in the introduction is important to improve the impact. For example, what is the time scale resolution of discrete transcriptomics measurements that is needed to accurately model the transcript changes? The data set used was two hours apart, but many experiments are not able to use such a fine level of resolution; how much molecular data is needed to be useful?
13. Several typographical and grammatical errors throughout the document still require attention: e.g., "outcomes [of] a starchless mutant" (line 22), spell out Arabidopsis (line 74), avoid use of contractions like "don't" (line 77), run-on sentences (lines 77-78, 112), "constraint-based" rather than "constraints-based" (line 88 and others), "dynamic model's" rather than "dynamic's model" (line 187), "Phtyozome" (line 239), "Kronkecker delta" (line 294), "vectorize" rather than "vectorized" (line 331), among others.
Response to Reviewers for mSystems00176-22 "Rhythm of the Night (and Day): Predictive metabolic modeling of circadian growth in Chlamydomonas"

Below is our point-by-point response to reviewers. Our response is shown in red text. Text copied over from the manuscript for ease of review is shown in italics, while line numbers from the marked-up manuscript pdf are included for ease of searching.

Notes from editor:
• I think the figure quality is in fact very good, but please please look critically at the text size and readability in Figure 4.
We have reexported Figure 4, but we should be clear that the smaller figures in the main figure are more for aesthetics than for interpretation so we weren't necessarily concerned with readability. If the editor would like us to remake this figure with larger text in the subfigures, we are happy to.
• Also there are more than 10 supplementary documents. Please reduce these to the minimum possible number, which to me looks like three (movie, Document S1 and combined sheets).
We have condensed the Excel files into one file, so now there are only 3 supplemental files: the Word document, the Excel file, and the movie.

REVIEWER 1

Summary:
This article presents an implementation of time-resolved transcriptomic data with flux balance analysis (FBA) to produce a dynamic version of expression-based stoichiometric metabolic modeling. Using discrete measurements of transcripts from another study, the authors developed continuous functions estimating transcript level (testing several different function estimations, assessing and determining the best fit for each gene). They implemented these transcript expression functions as additional constraints in the FBA framework to iteratively adjust the bounds for each metabolic reaction according to expression at each time step, also incorporating the capacity for changing biomass composition at each time step. The authors show that they can model the growth of wild type Chlamydomonas reinhardtii under diel conditions using the given data, and they also investigate the predicted effects of different gene knockouts, focusing mainly on predictions of the sta6 mutant which is unable to accumulate starch.
While the predictions using the gene expression-based dynamic FBA and the resulting flux map animations are certainly impressive, I think the impact and application of the article can be improved through the modifications suggested below, and the connection to the sustainable energy field can be strengthened.
Comments:
1. Alternate phrasing should be considered for the title in place of "circadian growth": either "circadian rhythm" or "diurnal growth", by convention.
Thank you for pointing this out. We have edited the entire document to either have circadian rhythm or diurnal growth (including the title).
2. The introduction focuses on the impact of light-dark cycles on economical growth of algae for sustainable fuels. However, the predictions show that WT C. reinhardtii is not expected to accumulate lipid droplets under nitrogen limitation in diurnal light as they are observed to do under constant light, which seems to undercut the rationale for pursuing the work presented in the introduction. The discussion and conclusion should more fully address this point and expand upon the potential applications.
Yes! This is an excellent example of why we need dynamic diurnal modeling because what has been tried in the past relies on physiological responses that have been characterized in continuous light, not diurnal light. Our model (as shown in the conclusion) is able to identify other approaches which may lead to the accumulation of lipids in algae in diurnal light. We have made this more clear in the introduction, see lines 71-82:

Currently, almost all metabolic engineering efforts in algae and cyanobacteria rely on growth in laboratory conditions with a continuous supply of light (9-22). This results in a steady state growth environment that more closely mimics that of heterotrophic bacteria and enables more straight forward design and engineering of cells. However, large scale growth of photosynthetic organisms necessitates growth in diurnal conditions outdoors, and the strong circadian rhythms that lead to dynamic gene expression can confound engineering efforts, as was reported by Cheah et al. (15). This also means that engineering strategies that have been shown to result in increased productivity in lab conditions will not directly translate to increased productivity in diurnal growth. Therefore, it is imperative that we develop tools that will enable more predictive and rational engineering of algal cells in diurnal growth.
3. It would be helpful for the introduction to include a bit more information on what kinds of algal production strains have been developed to date, to more clearly emphasize how the presented work can expand upon what has already been done.
The introduction already has a number of citations for metabolic engineering efforts in photosynthetic organisms (see lines 73, 77, 80). As far as we know, an engineered strain of algae has not yet been deployed for large scale growth outdoors; most industrial production of lipids or nutraceuticals uses wild type strains. There are industrial strains of engineered algae that are grown heterotrophically for production of designer lipids, but that is not the intended use of our model since heterotrophic growth is steady state.
We have added an additional line in the introduction to note this (see lines 77-82):
This also means that engineering strategies that have been shown to result in increased productivity in lab conditions will not directly translate to increased productivity in diurnal growth and is one reason why most current industrial algal production uses wild type strains (23).
4. The previous study from which the transcriptomic data was obtained is cited (Strenkert et al., 2019), but the experimental scenario should be explained further beyond the 12:12 light:dark criterion to better understand the conditions that the gene expression data represents (e.g., light intensity, CO2 availability, nitrogen source, etc.).
Thank you for the feedback; we've added some experimental information in lines 502-505, reproduced below: To build this data-driven model of Chlamydomonas reinhardtii, we used published transcriptomic data from cells grown in 12:12 day night cycles in a mixed photobioreactor with ambient air bubbling, replete nitrogen, and 200 µE light (1).
Should the reader want more info about the experimental conditions, they can read the referenced paper.
5. It is noted in line 115 that the sudden onset of light in the experimental conditions does not accurately mimic a natural environment; a more in-depth exposition of how this factor might affect the model results and the metabolic patterns interpreted from model results would be helpful. Would simulation predictions be expected to be quite different?
8. The gene-reaction rules AND and OR described in lines 637-642 seem to be defined backwards. AND would imply that both genes are used, which would involve a sum of the fluxes; OR would imply one or the other reaction, which would involve a minimum of the two fluxes. Additional explanation of this logic would help clarify.
We apologize if this point wasn't clear. The gene-protein-reaction rules operate off Boolean logic applied to the transcriptomic data, not the reactions. AND means that both gene products are required to make the enzyme needed to catalyze the reaction(s); therefore, the bound is set by the minimum of the two genes' expression levels. OR means that either gene product is sufficient to produce the enzyme to catalyze the reaction; therefore, the bound is set by the sum of both genes' expression levels. This relationship has been clarified in the manuscript in lines 563-568 (reproduced below), and is also elaborated in the article that is cited in the section in question.

When a reaction is catalyzed by a multimeric enzyme requiring the simultaneous expression of multiple genes, then the bound for said reaction is calculated by setting it to the minimum expression level of all the required genes, defined by the Boolean logic AND. Conversely, some reactions can be catalyzed by multiple enzymes; in this case, the reaction bound is calculated by proportionally setting it equal to the sum of transcript abundance of all associated gene products (defined by the Boolean OR).
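
The AND/OR logic described above can be illustrated with a minimal sketch. It is not the authors' implementation: the nested-tuple rule encoding, the function name `gpr_bound`, and the gene names and expression values are all hypothetical, and any scaling from transcript abundance to flux units is omitted.

```python
# Minimal sketch (assumed encoding, not the authors' code): a rule is either
# a gene ID string or a tuple ("and" | "or", sub_rule, sub_rule, ...).

def gpr_bound(rule, expression):
    """Return the expression-derived bound for a gene-protein-reaction rule.

    AND (enzyme complex, all subunits required) -> minimum of the operands.
    OR  (isozymes, any one suffices)            -> sum of the operands.
    """
    if isinstance(rule, str):
        # Leaf: look up the transcript abundance of a single gene.
        return expression[rule]
    op, *operands = rule
    values = [gpr_bound(sub_rule, expression) for sub_rule in operands]
    if op == "and":
        return min(values)
    if op == "or":
        return sum(values)
    raise ValueError(f"unknown operator: {op!r}")


# Hypothetical transcript abundances at one time point.
expression = {"geneA": 4.0, "geneB": 1.5, "geneC": 2.0}

# (geneA AND geneB) OR geneC: a two-subunit complex plus an isozyme.
rule = ("or", ("and", "geneA", "geneB"), "geneC")
bound = gpr_bound(rule, expression)  # min(4.0, 1.5) + 2.0
```

In this toy case the complex is limited by its scarcer subunit, and the isozyme adds capacity on top of that, which is the behavior the response describes.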
9. The knockout evaluation methods described in lines 495-497 state that a 168-hour simulation assumed a single doubling over that time frame, which seems extremely slow and conflicts with the information given in the Figure 3 legend which gives a doubling time of 24 hours; this apparent inconsistency should be clarified.
Thank you for mentioning a possible point of clarification here. Light availability is a changeable constraint in this simulation, so we lowered the light availability to place more stress upon the cells and reduce the energy budget, thus making it more clear which cells might be suffering from significantly detrimental mutations. We added a similar note in the manuscript, lines 777-785, which are reproduced below: The knockout mutants were simulated for 168 hours, a full week, at light and nongrowth-associated maintenance ATP light levels that produced approximately one doubling in the wild type cell over the length of the simulation. While these conditions differ from other simulations, they were selected for specific reasons: the long growing period gives the cells time to acclimate to the mutants and stabilize their own control loops, while the reduced available energy means that impacts of the mutations can more clearly be seen.
10. The arrangement of Figure 3C makes it confusing to understand the role of the two lines for sta6 and WT on the graph. It seems strange to show the WT when the x-axis represents knockouts, and they appear to have a final / initial mass value of 4, which is not possible for the WT according to the figure caption (reports a final / initial mass value of 2).
We appreciate the opportunity to provide some graphical clarification. The lines point to where the mutants are located in the graph. WT is included as a strain for the purposes of comparison. Clarified in manuscript, line 987:  (FAD). This redox coenzyme is only required in small amounts in the biomass equation, but is large and therefore metabolically costly to produce. Mutants that cannot make it are predicted to grow faster, but this result is unlikely to be experimentally borne out.
12. The conclusion / discussion should better address how the presented method can be extended or applied to other organisms. Currently the presented work serves as a very nice case study with a specific data set, but translation to other organisms and clarifying the utility for bioengineering applications as stated in the introduction is important to improve the impact. For example, what is the time scale resolution of discrete transcriptomics measurements that is needed to accurately model the transcript changes? The data set used was two hours apart, but many experiments are not able to use such a fine level of resolution; how much molecular data is needed to be useful?
Thank you for your feedback on these topics. We've added more information in two locations: a note on broader applicability has been added to the conclusion (lines 419-488, reproduced below), while a note on data resolution has been added to the methods (lines 668-670):
When that predictive power is combined with the inherently parallelizable nature of in silico modeling, it is possible to quickly assess bioengineering approaches for feasibility, as demonstrated by individual simulations of a broad swath of single knockout mutants. Additionally, this approach requires relatively few discrete types of data: the primary requirements are only a pre-existing constraint based model, an appropriate transcriptomic data set, and a timecourse of biomass. There is no other organism specific information required. Because of this, our approach can be generalized to any other photosynthetic algae species with similar data availability.
It is worth noting that the confidence of this approach is a function of the collection resolution; faster and more transient events demand smaller intervals between data points, as implied by the Nyquist-Shannon sampling theorem.
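
As a rough illustration of the sampling argument above, the Nyquist-Shannon criterion says a periodic signal can only be recovered if it is sampled more than twice per period. The sketch below applies that rule to a measurement interval; the function names are hypothetical, and the 2-hour interval is the spacing the reviewer attributes to the data set.

```python
# Minimal sketch of the Nyquist-Shannon criterion applied to time-course
# sampling. Function names are illustrative assumptions, not from the paper.

def shortest_resolvable_period(sampling_interval_h):
    """Periods must exceed twice the sampling interval to be resolvable."""
    if sampling_interval_h <= 0:
        raise ValueError("sampling interval must be positive")
    return 2.0 * sampling_interval_h


def longest_usable_interval(period_h):
    """Sampling intervals must be shorter than half the period of interest."""
    if period_h <= 0:
        raise ValueError("period must be positive")
    return period_h / 2.0


# Samples taken every 2 hours: oscillations with periods of 4 hours or less
# cannot be distinguished from slower ones (aliasing).
limit_for_2h_sampling = shortest_resolvable_period(2.0)

# A 24-hour rhythm needs samples spaced less than 12 hours apart.
limit_for_daily_rhythm = longest_usable_interval(24.0)
```

This bound is only a necessary condition for a strictly periodic signal; sharp transients such as the response to a sudden light onset would in practice need denser sampling than the limit suggests.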

REVIEWER 2
This manuscript reports the development of a dynamic metabolic model for diurnal growth of Chlamydomonas reinhardtii based on experimental data that predicts phenotype from genotype. It is very relevant because the implementation of this model, that includes the impact of genetic and environmental changes on the growth, biomass composition and intracellular fluxes, will allow faster design of processes for the production of high-value compounds and biofuels.
Comments to authors: 1. Check the grammar. I am not English native speaker but I found some mistakes in the grammar.
Thank you for noting this; we have re-read the document and corrected grammar mistakes.
2. Which gene(s) did you use as housekeeping gene for transcript abundance normalization then? Did you check the actin gene? Is the most used for normalization in Chlamy.
The transcriptomic data was obtained from the published paper (Strenkert et al., 2019). As it is published in PNAS and the corresponding author (Sabeeha Merchant) is a world expert in the use of RNA-Seq for quantifying transcript abundance in Chlamydomonas (and other algae), we didn't feel the need to reprocess the data but took it as published.
We did actually look at the expression levels of the well-known Chlamydomonas housekeeping gene RACK1 in the data (see Figure 1), and it is extremely stable across the timecourse of experiments.
3. The quality of figures are very poor or at least it's what I see in the pdf.
We apologize for this; we believe the conversion of the PDF was bad, so we have re-exported the figures, and for the final draft they will be uploaded as high-resolution .tif files.
4. Figure 2 and animated figure are too messy. It's there any other better way to show the metabolic fluxes? It does not look good with these thick arrows.
Thank you for your feedback. It is very difficult to visualize metabolic pathways due to their complexity, and the added variable of time makes it even more complex. Visually speaking, we believe (and it is the convention for fluxes) it is easier to see changes in the metabolism when the arrow thickness changes.
5. This model is for Chlamy, a model algae, but would it work in other commercial algae?
Indeed! Assuming the data and model exist, it is absolutely a feasible conversion. The conclusion has been edited to reflect this in lines 419-488, reproduced below.
When that predictive power is combined with the inherently parallelizable nature of in silico modeling, it is possible to quickly assess bioengineering approaches for feasibility, as demonstrated by individual simulations of a broad swath of single knockout mutants. Additionally, this approach requires relatively few discrete types of data: the primary requirements are only a pre-existing constraint based model, an appropriate transcriptomic data set, and a timecourse of biomass. There is no other organism specific information required. Because of this, our approach can be generalized to any other photosynthetic algae species with similar data availability.
6. Figure 3A and C, overall mass axis needs units?
8. Figure 4 legend. Would be good to have a description of the figure here.
Thank you for noting this. We've added an image description to the caption (lines 991-1001) and reproduced it below; we also exported it as a higher resolution pdf.

Dear Prof. Nanette R Boyle:
Thank you for addressing the reviewers' comments with rigor. They were both satisfied. I note that there are several text errors in the paper: for example, L114 "associated expression profiles in Error! Reference source not found". Please address these in the proofs.
Your manuscript has been accepted, and I am forwarding it to the ASM Journals Department for publication. For your reference, ASM Journals' address is given below. Before it can be scheduled for publication, your manuscript will be checked by the mSystems production staff to make sure that all elements meet the technical requirements for publication. They will contact you if anything needs to be revised before copyediting and production can begin. Otherwise, you will be notified when your proofs are ready to be viewed.
ASM policy requires that data be available to the public upon online posting of the article, so please verify all links to sequence records, if present, and make sure that each number retrieves the full record of the data. If a new accession number is not linked or a link is broken, provide production staff with the correct URL for the record. If the accession numbers for new data are not publicly accessible before the expected online posting of the article, publication of your article may be delayed; please contact the ASM production staff immediately with the expected release date.
Thank you for submitting your paper to mSystems.